ABSTRACT

The author suggests that mathematical formulas could 
provide some direct and easily understandable frameworks for 
analogies in literary criticism. Most studies of textual problems, he 
points out, either have failed to use full mathematical models or 
have been reckless with the inherent limits of these techniques. 
Enumeration of linguistic traits is very common to literary analysis, 
and discussions of genre and text style based on linguistic data can 
be very informative. There are, however, serious dangers to be 
considered in the choice and size of text samples, and 
oversimplification of qualitative features should be avoided. He 
suggests that the most successful studies of this type involve a 
consideration of syllable and word structure rather than frequency of 
a literary device. In the text of this paper, the author examines 
various treatises on language usage in literary works, and comments 
on their value. His discussion of the applicability of mathematical 
formulas to literary analysis is quite technical. [Not available in
hard copy due to marginal legibility of original document.] (EB)

STATISTICS AND THE SOUNDS OF POETRY
Richard W. Bailey

So distinguished a humanist as Richard Schoeck has recently
called mathematics to the attention of literary critics as a fertile
source for metaphors to illuminate their craft. Drawing from such
fields as topology and vector analysis, he shows that much current
criticism fumbles for words to express relations like those between
author, work, and audience that might be clarified by an apt analogy
from one of these disciplines. At the same time he cautions against
the pitfalls that result from taking these metaphors too literally and
mingling the exactness and precision that they would seem to offer with
matters properly belonging to taste and judgment. "There is more than
one literary parish," he warns, "that possesses its eager spirit who
proclaims the new computer gospel with insufficient inquiry into its
limitations. We need careful analysis, not hasty advertising" (Schoeck,
1968: 375). Some results from 'humanistic computing' clearly justify
his skepticism; with the exception of two or three cases of disputed
authorship concerning texts of minor literary importance, most studies
of textual problems either have failed to exploit the full power of
mathematical models or have run roughshod over the inherent limits of
these techniques.

While critics like Schoeck tentatively explore the possible impact
of mathematics on their work, linguists have been bolder in exploiting
mathematical models, and the use of quasi-algebraic notation is now
commonplace in accounts of language system. Statistical methods
are also acknowledged to have application in describing language use.
William Labov, among others, has emphasized that the stigmata of social
dialects are seldom so much a matter of all-or-nothing as the Biblical
shibboleth; the various dialects of English differ much less in their
underlying system of rules than the prophets of 'bidialectalism' and
other forms of linguistic engineering would readily acknowledge.
Linguistic differentiation (regional, social, literary, and so on)
is almost always a question of the typical uses of a commonly shared
system. A full account of the varieties of language used in a
community must resort to a statistical account of favored paths through
a network of linguistic rules.



Though scorned by many (for example, Ullmann, 1964: 118-21), the
enumeration of linguistic traits is actually much less foreign to the
practice of literary critics than essays like Schoeck's would seem to
imply. Harry Levin comments that "we need make no word-count to be
sure that [Hemingway's] literary vocabulary, with foreign and technical
exceptions, consists of relatively few and short words" (1951: 596).
In so saying, he at least confronts the kind of question that statistical
description of a text might pose for itself, but literary critics seldom
verify such assertions by statistical means. With a few notable
exceptions (e.g., Leaska, 1970), the concern that critics show for
particulars is incompatible with the generalizing power of statistics.
In bringing the insights of the Prague School formalists to the attention
of American critics, René Wellek in his contribution to Theory of
Literature drew attention to various interesting questions concerning
genre and text style that might be illuminated by such means. Nevertheless,
this influential book drew critics to consider literary minutiae and their
function in the texture of a literary work, a tradition that emphasized
the techniques of close reading initiated by explication de texte
rather than the broader generalization involved in studying literary
types in the context of aesthetic uses of language. Even as the
influence of the 'new criticism' espoused by Theory of Literature
has begun to wane, critics still denigrate the value of so useful a
term as "period style" in a continuing concern for the idiosyncratic
and personal in a literary work (see Chatman, 1966).

A full scale theory for discussing matters of style has recently
been put forward by Lubomir Doležel in a paper entitled "A Framework
for the Statistical Analysis of Style." In this essay, Doležel attempts
to acknowledge the competing influences on a text of the author's
personality, the generic constraints on his choices enforced by custom
and tradition, and the shaping inherent in the linguistic medium in
which he writes. The realization of this scheme requires a thorough
integration of linguistics, cultural history, literary judgment, and
statistics. Here in fact is an explicit and mathematical metaphor for
the forces impinging on a literary work; Doležel, perhaps more
thoroughly than Schoeck would find comfortable, has laid out the tasks
typically fused in the broadest literary scholar in a program that may
at first seem more palatable to the mathematician than to the
belle-lettristic critic. With such a variety of potential influences
on the work to consider, the narrowly trained academic may shy away
from the standards of explicitness that Doležel's framework calls for
or take refuge in the pervasive mathematical ludditism typical of
literary men.



As I have argued elsewhere (Bailey, 1969), some of the questions
of greatest concern to critics are amenable to mathematical treatment.
Yet work of this kind is historically troubled by literary fatuity or
statistical ineptness; those who undertake such research must confront
two audiences and only seldom have they been able to satisfy both.
Spurious exactness is as dangerous as the solipsism of hasty impressionism.
A remark by E. E. Stoll, made a generation ago, might well serve as a
rubric for all studies of this kind: "Error, which in criticism doth
so easily beset us, is, when in the guise of science and armed with
statistics, particularly insidious and dangerous. It seems to, but does
not, put other error to flight: it is therefore in special need of
detection" (Stoll, 1940: 390).

In attempting to carry out the tasks set by such theorists as
Doležel and Labov, one soon finds that even simple problems raise
vexing questions about samples and sample sizes, about the treatment of
qualitative features of texts by statistical means, and about the choice
of appropriate statistical tests to employ in evaluating the linguistic
attributes extracted from the text. All of these issues played a
part in the study to be discussed in this paper, a project that emerged
from a seminar in statistical stylistics held at the State University
of New York at Buffalo in the summer of 1969.¹ No startling
results of great literary or linguistic consequence will emerge in
this essay, but it is my hope that the strategy outlined here can be
usefully applied to questions of greater moment.

¹ Participants in the seminar were Miss Heide Marie Miller, Mr.
Fredrick L. Eyer, and Professor John M. Coetzee. Mr. Herbert B. Banford
III provided invaluable programming assistance in the early stages of the
work, and advice on statistical methodology was freely given by Professor
C. B. Bell. My interest in the field of statistical stylistics resulted

from the stimulating teaching of Professor Lubomir Doležel of the
University of Toronto. Any defects of design or execution in this
study are solely the responsibility of the author.

Our work began in an examination of the relation between statistical
prominence and perceptual significance. "The analyst may forget,"
we are reminded by Theory of Literature, "that artistic effect and
emphasis are not identical with the mere frequency of a device" (Wellek
and Warren, 1956: 171). In examining this issue, we turned to a study
of foregrounding of segmental phonemes in highly orchestrated texts.
To what extent could this phenomenon be attributed to the numerical
deployment of the sound resources of the language? Most handbook
accounts of sound patterning in poems are unsatisfying because of their
failure to specify the whole range of poetic sound effects characteristic
of verse. A more careful taxonomy, like that outlined by David I. Masson,
involves a consideration of syllable and word structure, line and
syntactic patterning. In the belief that we could cut away these
apparently unnecessary entities, we restricted our study to the stream
of sounds in a text, hoping that significant patterns would emerge
without acknowledging any other elements of linguistic or literary
form. In doing so, we were encouraged by Jiří Levý's claim, derived
from studies in several languages, that "verse makes more use of the
typical sounds of a given language and suppresses the rare ones"
(Levý, 1967: 99). Even more heady claims concerning the behavior
of sound segments tempted us to pursue the work; Marcello Boldrini,
for example, asserts that "the use of a certain speech-sound is
constant, or slightly variable in one author, but shows significant
differences between author and author. This conclusively proves
the originality in the use of speech-sounds on the part of the
poets" (Boldrini, 1948: 60). Both Levý and Boldrini appeal to
extensive experiments to support their views; both of them, our study
suggests, were wrong. The view that phonemic foregrounding is a
matter of the frequency distribution of sound segments cannot stand.
In casting about for a statistical technique that might yield
insight into the problem, we found that information theory seemed to
hold the greatest promise. Originally applied to problems in
thermodynamics, information theory has had widespread application
in designing communications systems (see Jackson, 1953, and Cherry,
1956); it is particularly attractive as a mathematical metaphor for
criticism since it provides the means to define the patterning hidden
in a great variety of complex and apparently random phenomena, the
figure concealed in the aesthetic carpet. As a simile for the hidden
structure in a chaotic set of events, information theory has already
found use in Thomas Pynchon's novel, The Crying of Lot 49, and in
several attempts to account for artistic organization including
Abraham Moles' treatise, Information Theory and Esthetic Perception.
Furthermore, studies of acknowledged worth have profited from the
mathematics of this field; for example, A. M. Kondratov's work on
rhythmical patterning in prose and poetry, and studies of vocabulary
richness and grammatical organization made by Henry Kučera, Robert
S. Wachal, and others.

The texts chosen for our study were transcribed in Trager-Smith
phonemic notation, a system chosen mainly to allow comparisons with A.
Hood Roberts' massive compilation of phonemic behavior, A Statistical
Linguistic Analysis of American English. Vachel Lindsay's heavily
alliterative poem, "The Congo," was transcribed in a form consonant
with the dialect represented in Roberts' work, and two texts already
available in phonemic transcription were included to provide further
comparisons: Dylan Thomas' assonantal poem, "Fern Hill," and a prose
passage often used in dialect analysis, "Grip the Rat."² These works,
coupled with Roberts' materials, permit an examination of Jan
Mukařovský's belief that "the standard language is the background
against which is reflected the esthetically intentional distortion
of the linguistic components of the work" (1933: 18).

² The transcription of "Fern Hill" was taken, with minor modifications,
from Loesch, 273-83; the transcription of "Grip the Rat" from Francis,
159-60.

Levý's claim that poets eschew the rare sounds of a language
was derived from work by Doležel on grapheme distribution in Czech
(Doležel, 1963). The poem that Doležel examined, Kundera's "Monology,"
does support Levý's assertion, but further studies on Czech phonemes
by Ludvíková and Kraus present a rather different picture. Their
results, derived from the application of information theory, are
reproduced in Table 1. The entropy values ($H_1$) shown there are
calculated by the following formula in which the sample probabilities
($p_i$) are taken to represent the probabilities for the population:

$$ H_1 = -\sum_{i=1}^{n} p_i \log_2 p_i $$

A low value for first-order entropy, $H_1$, reflects the tendency of
some phonemes to occur with markedly greater frequency than others.
In the case in which all the events measured occur with equal probability,
$H_1$ takes on a maximum value, the dyadic logarithm of $n$:

$$ H_{\max} = \log_2 n $$






In the case of our English texts, a phonemic alphabet of thirty-two
symbols was used. If English sounds all occurred with equal frequency
(as of course they do not) the entropy value for this set
of events would be $\log_2 32$ or 5.0. On the other hand, as the frequency
of particular phonemes increases at the expense of others, a successively
lower value for $H_1$ will result. To facilitate the comparison of texts
with alphabets of varying sizes, it is also useful to calculate the
relative entropy,





$$ H_{\mathrm{rel}} = \frac{H_1}{H_{\max}} $$

as well as its complement, the 'redundancy':

$$ R = 1 - H_{\mathrm{rel}} $$

Further details concerning both the mathematical and the linguistic
grounding for these calculations can be found in Gleason's Introduction
to Descriptive Linguistics and in E. V. Paducheva's "Information
Theory and the Study of Language."
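
The entropy, relative entropy, and redundancy measures described above can be illustrated with a short Python sketch. It is not the program used in the study; the function names and the toy data are assumptions introduced here purely for illustration, and sample proportions stand in for population probabilities, as in the text.

```python
import math
from collections import Counter

def first_order_entropy(symbols):
    """First-order entropy in bits, using sample proportions as probabilities."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def relative_entropy(symbols, alphabet_size):
    """Observed entropy divided by its maximum, log2 of the alphabet size."""
    return first_order_entropy(symbols) / math.log2(alphabet_size)

def redundancy(symbols, alphabet_size):
    """Complement of the relative entropy."""
    return 1.0 - relative_entropy(symbols, alphabet_size)

# Hypothetical data: an alphabet of 32 symbols used with equal frequency,
# and a toy sequence in which one symbol is heavily favored.
equiprobable = list(range(32))
skewed = list("aaaaaaabcd")

h_equiprobable = first_order_entropy(equiprobable)   # reaches log2(32)
h_skewed = first_order_entropy(skewed)               # well below log2(4)
r_skewed = redundancy(skewed, alphabet_size=4)
```

Passing the alphabet size explicitly, rather than counting the symbols that happen to occur, keeps the maximum fixed at the size of the inventory even when a short text omits some phonemes.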

Had Levý been able to examine the results obtained by Ludvíková
and Kraus, he might have modified his claim concerning the behavior
of sound segments in poems. For extremely different text styles
(drama and technical prose) correspondingly different entropy
values emerged. Yet poetry seems to be quite unremarkable in its
deployment of sounds. Doležel's calculation of a low $H_1$ value for
grapheme distribution in poetry (4.5722) appears to be anomalous in
light of the work of Ludvíková and Kraus, though a similar study by
Štukovský does suggest that graphic and phonemic segments vary
according to genre. Nevertheless, the application of entropy measures
does not seem to be very productive in contrasting sound distribution
in poems with that in other styles.

An application of these measures to three Rumanian poems has
recently been carried out by Solomon Marcus.³ Though the poems are
quite short, the relation of the entropy values calculated from them
would seem to confirm an aesthetic judgment, for the low $H_1$ value in
Table 2 for "La Mijloc de Codru" reflects the piling up of particular
sounds to produce a light, popular effect. The greater sonority of
the third poem, "Se Bate Miezul Nopții," contributes to a meditative
and philosophical tone in which phonemic foregrounding plays a less
significant role, a trait clearly represented in the higher $H_1$ value.

³ Most of the slight differences between the values shown in
Table 2 and those published by Marcus are owing to misprints in his
essay; I gratefully acknowledge Dr. Marcus' correspondence concerning
these matters.

In addition to these values for entropy and redundancy, Marcus
introduces an important measure of central tendency in qualitative
variables, the repeat rate or 'informational energy.' As Table
2 shows, this value (E) varies inversely with the entropy, and it
follows that the text with the greatest repetition of
phonemes will have the highest E characteristic. Octav Onicescu has
shown that this easily calculated value,

$$ E = \sum_{i=1}^{n} p_i^2 , $$

is, like entropy, a useful indicator of the structuring of events in
a pattern (see also Herdan, 1966: 271-73).
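
A brief Python sketch may make the behavior of the repeat rate concrete. The function name and the toy sequences are assumptions introduced for illustration; only the sum of squared proportions comes from the text.

```python
from collections import Counter

def informational_energy(symbols):
    """Repeat rate E: the sum of squared sample proportions."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())

# Hypothetical sequences over a four-symbol alphabet.
even_use = list("abcd")        # every symbol equally frequent
heavy_repetition = list("aaab")  # one symbol piled up

e_even = informational_energy(even_use)
e_heavy = informational_energy(heavy_repetition)
```

For an alphabet of n symbols E lies between 1/n (all symbols equally frequent) and 1 (a single symbol repeated throughout), so it rises as entropy falls.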

The results of Marcus' study would seem to indicate that information
theory provides a reliable correlate of the aesthetic effects that
readers recognize in the sound patterning in these three poems.
Nevertheless, the task still remains of testing this intuitive indication
of significance against the possibility that differences in $H_1$ result
from purely chance causes. Although the variables we are considering
are not independent, a testing procedure devised by G. P. Basharin
provides the mathematical grounds for exploring the approximate
significance of such differences. Both the estimate of the population
entropy and the variance ($s^2$) can be calculated as follows,



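with $N$ equal to the sample size in phonemes and $n$ equal to the
size of the alphabet of symbols:

$$ \hat{H} = H_1 + \frac{n-1}{2N} \log_2 e $$

$$ s^2 = \frac{1}{N} \left[ \sum_{i=1}^{n} p_i (\log_2 p_i)^2 - H_1^2 \right] $$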




With the help of these formulae it is possible to establish confidence
bounds for the dispersion of $H_1$ values. Taking a limit of .05 (1.96 s)
we can estimate the upper and lower values within which 95 of every
100 samples with distributions like those in the poems will vary on
the basis of chance. The results of these calculations are given
in Table 3. Since the observed value for $H_1$ for "La Mijloc de Codru"
falls below the lower bounds for the other two poems, we can take this
value to reflect a significant difference in the sound patterning of
this poem. The observed values for "Somnoroase Păsărele" and "Se Bate
Miezul Nopții" both fall within the same range, and no significance can
be ascribed to the difference between them at the .05 level of
discrimination.
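
The bounds described above can be sketched in Python under a stated assumption: that the procedure rests on the commonly cited form of Basharin's asymptotic results, in which the sample entropy underestimates the population value by about (n - 1)/(2N) log2 e and has variance (1/N)[sum of p (log2 p)^2 minus H squared]. The function name and the returned dictionary are hypothetical, and the sketch should not be read as the computation actually used for Table 3.

```python
import math
from collections import Counter

def entropy_bounds(symbols, alphabet_size, z=1.96):
    """Observed entropy, a bias-corrected estimate, and chance bounds.

    Assumes the usual statement of Basharin's asymptotic bias and variance;
    z = 1.96 corresponds to the .05 limit mentioned in the text.
    """
    counts = Counter(symbols)
    N = sum(counts.values())
    probs = [c / N for c in counts.values()]
    h = -sum(p * math.log2(p) for p in probs)
    # Assumed bias term: sample entropy runs low by (n - 1) / (2N) * log2(e).
    h_population = h + (alphabet_size - 1) * math.log2(math.e) / (2 * N)
    variance = (sum(p * math.log2(p) ** 2 for p in probs) - h ** 2) / N
    s = math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding
    return {
        "observed": h,
        "population_estimate": h_population,
        "s": s,
        "lower": h - z * s,
        "upper": h + z * s,
    }

# Hypothetical toy sample over a four-symbol alphabet.
result = entropy_bounds(list("aaaaaaabcd") * 20, alphabet_size=4)
```

Two texts would then be said to differ when the observed entropy of one falls outside the bounds computed for the other, which is the comparison the paragraph above makes for the three poems.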

In a parallel study of the difference between prose and poetry
in Tamil, Gift Siromoney found a significant contrast of the $H_1$ values
for graphemes in large samples, though he did suggest that other
testing procedures than the one proposed by Basharin provide a more
powerful discriminator of the two types. In a subsequent study of
$H_1$ in Telugu prose, P. Balasubrahmanyam and Siromoney are much more
pessimistic about the utility of this measure as an indication of
stylistic differences. A recomputation of their published results
(given in Tables 4 and 5) suggests a significant difference in
grapheme deployment between novels and short stories, while the
results for the other varieties of prose fall into the same range.
Since the two linguists say little about the aesthetic properties of
the various Telugu styles, it remains to be seen whether this measure
has any correlation with what readers perceive about sound patterning
in these texts.

Our own work with these measures in examining phonemic distributions
in English texts presents an even more mixed picture, as Tables 6 and
7 show. The observed $H_1$ values for the three texts studied are all
higher than the 4.4947 entropy figure calculated by Roberts for a huge
sample of American English. Splitting "The Congo" into three segments
of approximately equal length, we note that $H_1$ is fairly stable for
this text. But Levý's hypothesis that verse will, in general, have
a lower $H_1$ than other styles is clearly rejected by these data, for
"The Congo" has a higher value than either "Fern Hill" or "Grip the
Rat." "The Congo" does differ significantly from the other two texts,
though not in the direction that Levý would lead us to expect.
Furthermore, the vivid perceptual contrast between the highly orchestrated
"Fern Hill" and the thoroughly mundane "Grip the Rat" is not at all
clearly reflected in the results of these calculations.

A decade ago, Roman Jakobson made an eloquent plea for cooperation
between information theory and poetics, and the idea that significant
results would emerge from such an alliance has been repeated several
times since. The study of sound patterning in poems, however, should
apparently turn elsewhere, either to other statistical procedures or
to a more complex hypothesis in which suprasegmentals, syllable boundaries,
or syntactic structure (if not all three) are taken into account. The
history of scholarship in this area reflects all too clearly Stoll's
warning against error "in the guise of science and armed with statistics."
Boldrini's claim that sound distribution is a possible authorship
indicator is clearly refuted by the results we have examined; Levý's
view that phoneme frequencies in poems contrast with other text styles
does not hold for English and Czech and finds only partial verification
in Tamil and Rumanian. Entropy measurement proves to be a poor measure
of the aesthetic significance of poetic sound patterning, however
successful its other applications to linguistics or communication
science.

Another strategy for approaching this problem without going beyond
the level of segmental phonemes was suggested by H. Spang-Hanssen's
paper, "The Study of Gaps between Repetitions." In highly alliterative
verse, we surmised, like phonemes will tend to cluster closer together
than in a prose text. In other words, while the overall frequency of
sounds in an orchestrated text might not be markedly different from
what can be expected in prose, the gap distance from one instance of
a given phoneme to the next might well be shorter in the poem. By
calculating the size of the intervals between phonemes (from 0, adjoining
instances, to a distance ≥ 100), we hoped to find a clearer measure
of sound patterning than emerged from the application of entropy
measurement.
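
The gap measure can be sketched in Python as follows. The function names and the toy sequence are assumptions made for illustration, as is the decision to lump every gap above 100 into a single top class; the text specifies only that gaps run from 0 for adjoining instances.

```python
import math
from collections import Counter

def gap_intervals(phonemes, cap=100):
    """Gaps between successive occurrences of the same phoneme.

    A gap of 0 means two adjoining instances. Gaps larger than `cap`
    are lumped into the `cap` class (an assumption of this sketch).
    """
    last_position = {}
    gaps = []
    for position, phoneme in enumerate(phonemes):
        if phoneme in last_position:
            gap = position - last_position[phoneme] - 1
            gaps.append(min(gap, cap))
        last_position[phoneme] = position
    return gaps

def entropy_bits(values):
    """First-order entropy in bits of any sequence of discrete values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical toy sequence standing in for a phonemic transcription.
toy_text = list("bumbumbidubum")
toy_gaps = gap_intervals(toy_text)
toy_gap_entropy = entropy_bits(toy_gaps)
```

A text that reuses a few gap sizes again and again, as a refrain does, yields a low entropy for the gap distribution, while evenly scattered repetitions push the value up.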

To clarify the matter still further, two random sets of phonemes
were generated in which each phoneme in the inventory was selected
independently at a chance of 1/32. The entropy values for the phoneme
distribution in these artificial texts approached the maximum value
of 5.0, as expected, since in both random texts of 2,000 phonemes, sounds
occur with approximate equiprobability. When entropy measures were
applied to the distribution of gap intervals in the texts, no pattern
of the expected kind emerged (as Table 8 shows). All the values for
this feature in the English texts cluster close together, though the
relationship of the three texts was for the first time in the order
we expected. The alliterative text, "The Congo," did reveal a greater
exploitation of some gap intervals as reflected in its relatively low
$H_1$ (gaps) figure, 5.594, while "Fern Hill" and "Grip the Rat" came
successively closer to the even distribution of gaps found in the
random texts. Wide-ranging values for the three subsets of "The
Congo," however, suggest that this feature is not particularly stable
in the poem. Yet the low value for the middle segment of the poem
does reflect a significant feature of the text, the concentration of a
repetitive refrain in this section, a trait that piles up a succession
of like intervals and reduces the $H_1$ value for gaps. When these
refrains are removed from the text, however, the $H_1$ (gaps)
approaches the figure achieved by the other two texts, despite the
clearly alliterative nature of the remaining lines of the poem.

A further test of the behavior of gaps in the three texts was
carried out with the help of the Kolmogorov-Smirnov two-sample test
(see Siegel, 1956: 127-36). Once again the condition of independence
is not met, but the results of this test can at least be taken as
indicative of the relations between the distributions. The greatest
difference between the cumulative frequency distributions of the
gap intervals, D, is given in Table 9 for each pair of texts. If the
observed value of D exceeds the value calculated by the following
formula (where $n_1$ and $n_2$ are the numbers of gaps in the two texts),
the difference may be taken as significant at a 95% level of confidence:

$$ 1.36 \sqrt{\frac{n_1 + n_2}{n_1 n_2}} $$




These calculated values appear in parentheses in Table 9. Only
the randomly constructed texts contrast significantly with the
English texts in their distribution of gap intervals, though "The
Congo" and "Fern Hill" are slightly different in their deployment of
gaps. Nevertheless, this study of the sequential ordering of phonemes
is no more revealing of the aesthetic differences between the texts
than the probabilistic data subsumed in the entropy calculations
turned out to be.
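
The two-sample comparison can be sketched in Python. This is an illustration under assumptions rather than the study's own program: the function name and the sample data are hypothetical, and the critical value uses the large-sample coefficient 1.36 for the .05 level with the two sample sizes entered separately.

```python
import math
from collections import Counter

def ks_two_sample(sample_a, sample_b):
    """Return (D, critical value at the .05 level) for two samples.

    D is the largest absolute difference between the two cumulative
    relative frequency distributions.
    """
    n1, n2 = len(sample_a), len(sample_b)
    counts_a, counts_b = Counter(sample_a), Counter(sample_b)
    cumulative_a = cumulative_b = 0
    d = 0.0
    for value in sorted(set(counts_a) | set(counts_b)):
        cumulative_a += counts_a[value]
        cumulative_b += counts_b[value]
        d = max(d, abs(cumulative_a / n1 - cumulative_b / n2))
    critical = 1.36 * math.sqrt((n1 + n2) / (n1 * n2))
    return d, critical

# Hypothetical gap samples: one crowded toward short gaps, one spread out.
clustered = [0, 0, 1, 1, 1, 2, 2, 3] * 25
dispersed = [5, 10, 15, 20, 25, 30, 35, 40] * 25
d, critical = ks_two_sample(clustered, dispersed)
differs = d > critical
```

Because successive gaps in a real text are not independent observations, a result of this kind is best read as indicative only, as the text itself cautions.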

In a third attempt to characterize the behavior of phonemic
segments in these texts, we examined the rank-ordering of the sounds.
Could the relatively high rank of /b/ in "The Congo" (the fourteenth
most frequent phoneme) be considered significant when compared to
its position of twenty-first in Roberts' list, twenty-second in "Fern
Hill," and twenty-third in "Grip the Rat"? A measure of rank-correlation,
the Spearman rho test, yielded the results given in Table 11 (see Siegel,
1956: 202-15). All of the coefficients reveal a strong relationship
between the samples. "Grip the Rat" most resembles Roberts' ordering
of phonemes, closely followed by "The Congo" and "Fern Hill." Of
particular significance is the very high correlation between the subsets
of "The Congo," a result that suggests that phonemic foregrounding
exploits the same phonemic segments throughout the poem. Nevertheless,
the rho values are much the same in comparisons between Roberts' ranking
and the ranks in the texts. The striking differences between "Grip the
Rat" and "The Congo" are still not vividly apparent from these calculations.
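
A rank correlation of this kind can be sketched in Python. The sketch assumes that tied frequencies share the mean of the ranks they occupy and computes rho as the product-moment correlation of the two sets of ranks, which is one standard way of handling ties; the function names and the frequency counts are hypothetical.

```python
import math

def average_ranks(frequencies):
    """Rank 1 goes to the most frequent item; ties share their mean rank."""
    order = sorted(frequencies, key=frequencies.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and frequencies[order[j + 1]] == frequencies[order[i]]:
            j += 1
        mean_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(freq_a, freq_b):
    """Spearman rho for two frequency tables over the same set of phonemes."""
    keys = sorted(freq_a)
    ranks_a, ranks_b = average_ranks(freq_a), average_ranks(freq_b)
    xs = [ranks_a[k] for k in keys]
    ys = [ranks_b[k] for k in keys]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical counts for a handful of phonemes in two texts.
text_one = {"t": 90, "n": 80, "s": 60, "b": 20, "g": 10}
text_two = {"t": 85, "n": 70, "s": 75, "b": 30, "g": 5}
rho = spearman_rho(text_one, text_two)
```

Both tables must cover the same inventory, so a phoneme absent from one text needs an explicit count of zero before the comparison is made.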

As a result of our study of entropy, gap distribution, and rank-ordering,
we are forced to conclude that the foregrounding of segmental phonemes
cannot be specified by an examination of phonemic distribution alone.
Stress placement, syntactic structure, and thematic organization must
be acknowledged in the formation of a hypothesis describing sound
patterns in texts. A conclusion of this sort might easily have been
anticipated at the outset of the study, of course, but the careful
examination of the phenomenon in the simplest terms seemed plausible
enough to make it worth testing. Furthermore, the statistical techniques
illustrated here will have value in testing more complicated hypotheses,
for they illustrate the procedures that must be involved in a thorough-going
treatment of language use.

While the strategies discussed so far have not yielded the results
anticipated by our intuitive hypothesis, certain differences in the
phoneme distributions in the three texts still call for explanation.
As an interim measure for discussing these differences, we propose a
'prominence index' (P) to account for significant variations in
rank-order in the author's deployment of sound segments in texts. This
index is based primarily on the rank difference between a given
phoneme in the text and the rank calculated for that phoneme in a
representative body of American English by Roberts. The phoneme /cS/,
for example, has a rank of twenty-three in Roberts' results and a rank
of fourteen in "Grip the Rat," a rank difference of +9. The phoneme
/a/, on the other hand, is eighth most frequent in Roberts' study, but
twenty-first in "Grip the Rat," a rank difference of -13. When phonemes
have the same frequency in a sample text, such ties are treated as they are
in the Spearman rho test. In "The Congo," for example, /i/ and /^/
occur in the same number; in calculating P, the ranks are averaged to
29.5.

Positive rank-differences would seem to have greater perceptual
value than negative ones. That is, readers will hardly be aware of
the relative absence of a particular phoneme, but greater exploitation
of a phoneme than might normally be expected will usually be noticed.
Furthermore, the promotion of rare phonemes will have more impact on
the sound orchestration of a text than the same rank difference for
a common sound. Therefore, the perceptual index should take into account
the normal expectation of a phoneme, as well as its rank difference
in the text. From these considerations, the following formula has
been devised for the prominence index. $R_i$ denotes the rank in Roberts'
count; $r_i$, the rank in the text; $p_i$, the probability for the phoneme
given by Roberts (values reproduced here in the first column of Table
10). A factor of 100 has been introduced to avoid decimals.

$$ P = 100 \, (R_i - r_i)(1.0 - p_i) $$

The prominence indices for the phonemes in our three texts are given 
in Table 12. 
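
The arithmetic of the index can be illustrated with a short Python sketch. The ranks are those of the two examples given in the text; the probabilities are hypothetical placeholders chosen for illustration, not figures from Roberts' tables, and the function name is likewise an assumption.

```python
def prominence_index(rank_reference, rank_text, p_reference):
    """P = 100 * (R - r) * (1 - p).

    rank_reference: rank of the phoneme in the reference count (R)
    rank_text:      rank of the phoneme in the text under study (r)
    p_reference:    probability of the phoneme in the reference count (p)
    """
    return 100 * (rank_reference - rank_text) * (1.0 - p_reference)

# Ranks from the examples in the text; probabilities are hypothetical.
promoted = prominence_index(23, 14, p_reference=0.01)   # rank difference +9
demoted = prominence_index(8, 21, p_reference=0.05)     # rank difference -13

# The same promotion counts for more when the phoneme is normally rare.
rare = prominence_index(20, 10, p_reference=0.005)
common = prominence_index(20, 10, p_reference=0.08)
```

Since the weighting factor (1 - p) stays close to 1 for every phoneme in a thirty-two symbol inventory, the index is dominated by the rank difference, with the probability acting as a mild correction in favor of rare sounds.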

What conclusions can be drawn from the values calculated for
the prominence index? In "Grip the Rat," most of the index values
are low, a result that reflects the commonplace nature of the text
when set against the expectations of American English. Only two
phonemes differ markedly in prominence from Roberts' count: /a/,
with an index of -1240, and /o/ with +1292. The choice of these
two phonemes for particular prominence is no accident since, it will
be recalled, this passage was designed to elicit dialect differences
from its readers. Such a test passage would necessarily provide
frequent opportunities for informants to demonstrate their characteristic
use of low and back vowels since these phonemes are significant markers
of regional differences in American speech. High index values for /t^/
and /o/ in this passage can be attributed to the same cause, and the
realization of these four phonemic segments in speech will have
considerable value for the dialectologist.

The relatively low rho correlations between "Fern Hill" and the
other samples shown in Table 11 are easily explained by reference to
the index values calculated for this text. Unlike "The Congo," "Grip
the Rat," and Roberts' study, the transcription of "Fern Hill" used
here is not based on the American English of the Upper Midwest but on
Thomas' own reading of his poem. The low index value for /r/, -1214,
reflects the postvocalic r-lessness typical of British speech, while
a high value for /h/ (+779) shows the convention of the "centering
glide" that a Trager-Smith notation offers for characterizing this
r-lessness. Other dialect differences between Thomas' speech and
American English are also highlighted by the index values for low and
back vowels, particularly /æ/ and /o/. But not all the large index
values are owing to dialect differences of this sort. Once this
measure has been coordinated with thematic and syntactic structure,
the high value for /ŋ/ can be traced to Thomas' preference for participial
constructions ("lilting house," "spinning place") and for the thematic
repetition of the word 'young'. Further aesthetic consequences may
also emerge from the apparent preference for voiced segments: /z/ over
/s/, /d/ over /t/, /g/ over /k/, /b/ over /p/, /ð/ over /θ/. Without
integration into a full scale analysis of the poem, these facts are
without consequence. But clearly they are a part of the aesthetic
impact of the text and must be considered in an examination of the
structuring of language for artistic ends.

Indications of phonemic foregrounding like those found in “Fern 
Hill” also emerge from an examination of the prominence indices for 
“The Congo.” As a phonemic counterpart to the drum-beating intended 
to accompany performances of this poem, Lindsay de-emphasizes front 
vowels (/i/, and /e/, -191) for the more resonant back vowels 
(/u/, +392, and /o/, +640). In addition, the exploitation of these 
back vowels is also reflected in the high index value for the so-called 
back glide in Trager-Smith phonemics: /w/. A preference for 
voiced consonants, even more extreme than that noted for “Fern Hill,” 
is also noticeable in the index values for “The Congo,” with high 
ranks assigned to /b/, /g/, /z/, /Vt /V, and relatively low ones 
to /p/, A/t A/» A/» /o/. Once again, it would be futile to 
infer sense from sound or to suggest that the effects shown by the 
index have a value independent of the semantic and syntactic organization 
of the poem. Yet, as already noted in Table 11, the promotion in rank 
shown by the index values is highly consistent in the poem, and the 
phonemic structuring reflected in the P values vividly furthers Lindsay's 
aesthetic ends. 

Our study has revealed that the distribution of phonemes in languages 
operates within highly restrictive limits. Certain high frequency 
phonemes will occur about ten percent of the time, while less frequent 
phonemes occupy quite stable positions in a decreasing series. Table 13 
shows the redundancy values for the various languages discussed in this 
paper; the rather small variation between languages suggests that the 
frequency series occupied by phonemes is relatively fixed, even in the 
deformation of normal language in aphasia. The approximate similarity 
of redundancy values thus suggests that a linguistic universal constrains 
human speech in the deployment of phonemes. But we have also found 
that the particular phonemes chosen to occupy a given frequency can 
vary considerably between texts. The prominence index proposed here 
reflects the promotion or demotion of particular phonemes in the 
series, a characteristic that proves to be related not only to dialect 
and language, but also to the aesthetic organization of a text. 
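
A redundancy value of the kind summarized above can be sketched as follows. The sketch assumes the familiar information-theoretic definition, one minus the ratio of the observed entropy to the maximum entropy possible for an inventory of the same size; the definition actually used for Table 13 is given earlier in the paper and may differ in detail. The function name and the toy counts are hypothetical.

```python
import math


def redundancy(counts):
    """Relative redundancy 1 - H / H_max of a phoneme frequency count.

    ASSUMPTION: this is the standard Shannon-style measure, not
    necessarily the exact computation behind the paper's Table 13.
    """
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    if len(probs) < 2:
        raise ValueError("at least two phonemes with nonzero counts are needed")
    entropy = -sum(p * math.log2(p) for p in probs)  # observed entropy H
    max_entropy = math.log2(len(probs))              # H_max: all equally likely
    return 1.0 - entropy / max_entropy


# Invented toy counts, for illustration only.
even = redundancy({"a": 1, "b": 1, "c": 1, "d": 1})
skewed = redundancy({"a": 2, "b": 1, "c": 1})
```

On this definition an inventory whose members are all equally frequent has a redundancy of zero, and the value rises as a few phonemes come to dominate the count.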

The procedures used in this study are not a result of positivistic 
yearnings for exactness in all questions of judgment. Instead they 
point toward a rigor of method and definition. Too much of literary 
commentary verges toward precision without confronting what it means 
to be precise. This effort to relate statistical techniques and literary 
matters, I hope, points toward the boundary between problems of fact 
and questions of opinion.6 Perhaps the words of George Steiner, a 
man particularly attuned to the amoral possibilities of a criticism 
oblivious to the potency of literary truth, can serve as a fitting 
close to this study: “The shapes of reality and of our imaginative 
grasp are exceedingly difficult to foresee. Nevertheless, the student 
of literature now has access to and responsibility towards a very rich 
terrain, intermediate between the arts and sciences, a terrain bordering 
equally on poetry, on sociology, on psychology, on logic, and even on 
mathematics” (Steiner, 1965? S6). 